From Regular Expressions to Deterministic Automata

Authors

  • Gérard Berry
  • Ravi Sethi
Abstract

The main theorem allows an elegant algorithm to be refined into an efficient one. The elegant algorithm for constructing a finite automaton from a regular expression is based on 'derivatives of' regular expressions; the efficient algorithm is based on 'marking of' regular expressions. Derivatives of regular expressions correspond to state transitions in finite automata. When a finite automaton makes a transition under input symbol a, a leading a is stripped from the remaining input. Correspondingly, if the input string is generated by a regular expression E, then the derivative of E by a generates the remaining input after a leading a is stripped. Brzozowski (1964) used derivatives to construct finite automata; the state for expression E has a transition under a to the state for the derivative of E by a. This approach extends to regular expressions with new operators, including intersection and complement; however, explicit computation of derivatives can be expensive. Marking of regular expressions yields an expression with distinct input symbols. Following McNaughton and Yamada (1960), we attach subscripts to each input symbol in an expression; (ab+b)*ba becomes (a₁b₂+b₃)*b₄a₅. Conceptually, the efficient algorithm constructs an automaton for the marked expression. The marks on the transitions are then erased, resulting in a nondeterministic automaton for the original unmarked expression. This approach works for the usual operations of union, concatenation, and iteration; however, intersection and complement cannot be handled because marking and unmarking do not preserve the languages generated by regular expressions with these operators.
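
The two ideas named in the abstract, derivatives and marking, can be illustrated with a minimal Python sketch. It is not the paper's construction: the tuple encoding of expressions and the helper names (`sym`, `alt`, `cat`, `star`, `nullable`, `deriv`, `matches`, `mark`, `symbols`) are assumptions made here for illustration. The sketch only walks derivatives along one input string and numbers the symbols of an expression; it does not build the automaton itself.

```python
# Regular expressions as nested tuples (an encoding assumed for this sketch):
#   ("empty",)  no strings      ("eps",)  the empty string
#   ("sym", a)  ("alt", r, s)   ("cat", r, s)   ("star", r)
EMPTY = ("empty",)
EPS = ("eps",)

def sym(a):
    return ("sym", a)

def alt(r, s):
    # Union, with light simplification so derivatives stay small.
    if r == EMPTY:
        return s
    if s == EMPTY or r == s:
        return r
    return ("alt", r, s)

def cat(r, s):
    # Concatenation, with light simplification.
    if r == EMPTY or s == EMPTY:
        return EMPTY
    if r == EPS:
        return s
    if s == EPS:
        return r
    return ("cat", r, s)

def star(r):
    if r in (EMPTY, EPS):
        return EPS
    return r if r[0] == "star" else ("star", r)

def nullable(r):
    # True when r generates the empty string.
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "sym"):
        return False
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])  # cat

def deriv(r, a):
    # Derivative of r by a: generates what remains after a leading a is stripped.
    tag = r[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "sym":
        return EPS if r[1] == a else EMPTY
    if tag == "alt":
        return alt(deriv(r[1], a), deriv(r[2], a))
    if tag == "star":
        return cat(deriv(r[1], a), r)
    left = cat(deriv(r[1], a), r[2])  # cat
    return alt(left, deriv(r[2], a)) if nullable(r[1]) else left

def matches(r, word):
    # Each derivative plays the role of the state reached after one more symbol.
    for a in word:
        r = deriv(r, a)
    return nullable(r)

def mark(r, counter=None):
    # Attach a distinct subscript to every symbol occurrence, left to right.
    counter = [0] if counter is None else counter
    tag = r[0]
    if tag == "sym":
        counter[0] += 1
        return ("sym", (r[1], counter[0]))
    if tag in ("empty", "eps"):
        return r
    if tag == "star":
        return ("star", mark(r[1], counter))
    left = mark(r[1], counter)
    right = mark(r[2], counter)
    return (tag, left, right)

def symbols(r):
    # Symbol occurrences of r in left-to-right order.
    if r[0] == "sym":
        return [r[1]]
    return [s for child in r[1:] for s in symbols(child)]

# The abstract's example expression (ab+b)*ba.
E = cat(star(alt(cat(sym("a"), sym("b")), sym("b"))), cat(sym("b"), sym("a")))
print(matches(E, "abba"), matches(E, "ab"))
print(symbols(mark(E)))
```

Here `symbols(mark(E))` lists the five marked positions in the same order as the subscripts in the abstract's example, and `matches` accepts a string exactly when the expression left after all its symbols have been stripped generates the empty string.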

Similar resources

A novel algorithm for the conversion of shuffle regular expressions into non-deterministic finite automata

Regular expressions with shuffle operators are widely used in diverse fields of computer science. The work presented here investigates the shuffling of regular expressions and their conversion into non-deterministic finite automata. The aim of the paper is to design a novel algorithm for constructing ε-free non-deterministic finite automata from the shuffling of regular expressions. Non-determ...

Block-Deterministic Regular Languages

We introduce the notions of blocked, block-marked and block-deterministic regular expressions. We characterize block-deterministic regular expressions with deterministic Glushkov block automata. The results can be viewed as a generalization of the characterization of one-unambiguous regular expressions with deterministic Glushkov automata. In addition, when a language L has a block-determinist...

Model Checking Regular Language Constraints

Even the fastest SMT solvers have performance problems with regular expressions from real programs. Because these performance issues often arise from the problem representation (e.g. non-deterministic finite automata get determinized and regular expressions get unrolled), we revisit Boolean finite automata, which allow for the direct and natural representation of any Boolean combination of regu...

Learning Regular Languages via Alternating Automata

Nearly all algorithms for learning an unknown regular language, in particular the popular L∗ algorithm, yield deterministic finite automata. It was recently shown that the ideas of L∗ can be extended to yield non-deterministic automata, and that the respective learning algorithm, NL∗, outperforms L∗ on randomly generated regular expressions. We conjectured that this is due to the existential na...

A Bialgebraic Review of Regular Expressions, Deterministic Automata and Languages

This paper reviews the classical theory of deterministic automata and regular languages from a categorical perspective. The basis is formed by Rutten's description of the Brzozowski automaton structure in a coalgebraic framework. We enlarge the framework to a so-called bialgebraic one, by including algebras together with suitable distributive laws connecting the algebraic and coalgebraic struc...

A Novel Algorithm for the Conversion of Parallel Regular Expressions to Non-deterministic Finite Automata

The aim of the paper is to concoct a novel algorithm for the metamorphosis of parallel regular expressions to ε-free nondeterministic finite automata. For a given parallel regular expression r, let m be the number of symbols that occur in r and let C denote the number of concatenation operators in r. In the worst case, 2m+1 states are required for the construction of the non-deterministic finit...

Journal:
  • Theor. Comput. Sci.

Volume 48

Publication date: 1986